Exploring data driven parametric synthesis
Abstract
This paper describes our work on building a formant synthesis system that combines rule-generated and database-driven methods. Three parametric synthesis systems are discussed: our traditional rule-based system, a speaker-adapted system, and a gesture system. The gesture system extends the adapted system by including concatenated formant gestures drawn from a data-driven unit library. The systems are evaluated technically by comparing their formant tracks with those of an analysed test corpus. The gesture system achieves a 25% error reduction in the formant frequencies, owing to the inclusion of the stored gestures. Finally, a perceptual evaluation shows a clear advantage in naturalness for the gesture system over both the traditional system and the speaker-adapted system.
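To make the arithmetic behind a relative error reduction concrete, here is a minimal sketch. The abstract does not state which error metric or baseline was used, so the mean absolute error in Hz, the function names, and the formant values below are all assumptions chosen for illustration, not the paper's method or data.

```python
from typing import Sequence


def mean_abs_error(synth: Sequence[float], ref: Sequence[float]) -> float:
    """Mean absolute difference (Hz) between two equally long formant tracks."""
    if len(synth) != len(ref) or not ref:
        raise ValueError("tracks must be non-empty and of equal length")
    return sum(abs(s - r) for s, r in zip(synth, ref)) / len(ref)


def relative_error_reduction(baseline_err: float, new_err: float) -> float:
    """Fraction by which new_err is lower than baseline_err."""
    if baseline_err <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline_err - new_err) / baseline_err


# Made-up F1 values in Hz (hypothetical, not taken from the paper).
reference = [500.0, 520.0, 540.0, 560.0]  # analysed test corpus
adapted = [540.0, 560.0, 580.0, 600.0]    # baseline track, 40 Hz off per frame
gesture = [530.0, 550.0, 570.0, 590.0]    # track with gestures, 30 Hz off per frame

baseline = mean_abs_error(adapted, reference)
improved = mean_abs_error(gesture, reference)
reduction = relative_error_reduction(baseline, improved)
print(f"{reduction:.0%}")  # prints "25%"
```

With these invented numbers the error drops from 40 Hz to 30 Hz, which is a 25% relative reduction; a different metric (for example RMS error or error on a perceptual frequency scale) would give a different figure for the same tracks.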
Similar resources
Conceptual estimating tool for technology-driven projects: exploring parametric estimating technique
This paper examines a parametric estimating technique applied to technology-driven projects. Parametric cost estimating is a widely used approach for bidding on a contract, input into a cost benefit analysis, or as the pre-planning tool for project implementation. Extensive literature reviews suggest that effective parametric estimating methodology is becoming an essential tool for technology-d...
A Deep Learning Approach to Data-driven Parameterizations for Statistical Parametric Speech Synthesis
Nearly all Statistical Parametric Speech Synthesizers today use Mel Cepstral coefficients as the vocal tract parameterization of the speech signal. Mel Cepstral coefficients were never intended to work in a parametric speech synthesis framework, but as yet, there has been little success in creating a better parameterization that is more suited to synthesis. In this paper, we use deep learning a...
From diphones to allophones: from data to rules
A research project is presented in which we aim to design a speech synthesis model based on both the diphone and the allophone concepts, i.e. the data-driven and rule-driven approach for speech synthesis, respectively. At present, diphone concatenation for Dutch leads to more intelligible speech than when a rule-based allophone synthesis is applied, although the latter synthesis has the theoret...
A linguistic and prosodic database for data-driven Japanese TTS synthesis
We propose a method to generate a database that contains a parametric representation of F0 contours associated with linguistic and acoustic information, to be used by data-driven Japanese text-to-speech (TTS) systems. The configuration of the database includes recorded speech, F0 contours and their parametric labels, phonetic transcription with durations, and other linguistic information such a...
Generation of F0 contours using a model-constrained data-driven method
This paper introduces a novel model-constrained, data-driven method for generating fundamental frequency contours in Japanese text-to-speech synthesis. In the training phase, the parameters of a command-response F0 contour generation model are learned by a prediction module, which can be a neural network or a set of binary regression trees. The input features consist of linguistic information r...